161 research outputs found

    Small steps and giant leaps: Minimal Newton solvers for Deep Learning

    We propose a fast second-order method that can be used as a drop-in replacement for current deep learning solvers. Compared to stochastic gradient descent (SGD), it only requires two additional forward-mode automatic differentiation operations per iteration, which has a computational cost comparable to two standard forward passes and is easy to implement. Our method addresses long-standing issues with current second-order solvers, which invert an approximate Hessian matrix every iteration exactly or by conjugate-gradient methods, a procedure that is both costly and sensitive to noise. Instead, we propose to keep a single estimate of the gradient projected by the inverse Hessian matrix, and update it once per iteration. This estimate has the same size and is similar to the momentum variable that is commonly used in SGD. No estimate of the Hessian is maintained. We first validate our method, called CurveBall, on small problems with known closed-form solutions (noisy Rosenbrock function and degenerate 2-layer linear networks), where current deep learning solvers seem to struggle. We then train several large models on CIFAR and ImageNet, including ResNet and VGG-f networks, where we demonstrate faster convergence with no hyperparameter tuning. Code is available
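
The idea of keeping one momentum-like estimate of the inverse-Hessian-projected gradient, rather than inverting a Hessian, can be illustrated on a toy quadratic. The following is a minimal sketch and not the authors' implementation: the function name, the fixed values of `rho` and `beta`, and the explicit matrix product standing in for forward-mode automatic differentiation are all assumptions made for illustration.

```python
import numpy as np

def curveball_like_quadratic(A, b, rho=0.9, beta=0.1, steps=300):
    """Minimise f(w) = 0.5 * w @ A @ w - b @ w without ever inverting A.

    z is a single vector of the same size as w (like SGD momentum) that
    tracks the Newton step -A^{-1} grad; it is refined once per iteration.
    rho and beta are hand-picked here (an assumption of this sketch).
    """
    w = np.zeros_like(b, dtype=float)
    z = np.zeros_like(b, dtype=float)
    for _ in range(steps):
        grad = A @ w - b      # gradient of f at the current w
        hess_z = A @ z        # Hessian-vector product (explicit here; assumption)
        # One gradient step on the local model 0.5 * z @ A @ z + z @ grad,
        # with rho acting as a forgetting factor on the previous estimate.
        z = rho * z - beta * (hess_z + grad)
        w = w + z             # apply the estimated Newton-like step
    return w

A = np.array([[6.0, 2.0], [2.0, 3.0]])   # symmetric positive definite toy Hessian
b = np.array([1.0, 1.0])
w = curveball_like_quadratic(A, b)
```

On this toy problem the iterate can be compared with `np.linalg.solve(A, b)`; no Hessian estimate or linear solve appears inside the loop.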

    End-to-end representation learning for Correlation Filter based tracking

    The Correlation Filter is an algorithm that trains a linear template to discriminate between images and their translations. It is well suited to object tracking because its formulation in the Fourier domain provides a fast solution, enabling the detector to be re-trained once per frame. Previous works that use the Correlation Filter, however, have adopted features that were either manually designed or trained for a different task. This work is the first to overcome this limitation by interpreting the Correlation Filter learner, which has a closed-form solution, as a differentiable layer in a deep neural network. This enables learning deep features that are tightly coupled to the Correlation Filter. Experiments illustrate that our method has the important practical benefit of allowing lightweight architectures to achieve state-of-the-art performance at high framerates. Comment: To appear at CVPR 201
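
The closed-form Fourier-domain solution mentioned in the abstract can be sketched for the simplest case. This is a hedged illustration, not the paper's layer: it assumes a 1-D, single-channel real signal, ridge regression over all circular shifts, and hypothetical names (`train_correlation_filter`, `respond`, `lam`).

```python
import numpy as np

def train_correlation_filter(x, y, lam=0.1):
    """Ridge regression over every circular shift of x, solved per frequency.

    Equivalent to min_w ||X w - y||^2 + lam ||w||^2 where row i of X is
    np.roll(x, i). Because X is circulant, the solution is element-wise
    in the Fourier domain: w_hat = x_hat * y_hat / (|x_hat|^2 + lam).
    """
    x_hat = np.fft.fft(x)
    y_hat = np.fft.fft(y)
    w_hat = x_hat * y_hat / (np.abs(x_hat) ** 2 + lam)
    return np.real(np.fft.ifft(w_hat))

def respond(w, z):
    """Score the template w against every circular shift of z (this is X w when z = x)."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(z)) * np.fft.fft(w)))

rng = np.random.default_rng(0)
n = 32
x = rng.standard_normal(n)                       # toy training signal
d = np.minimum(np.arange(n), n - np.arange(n))   # circular distance to index 0
y = np.exp(-0.5 * (d / 2.0) ** 2)                # desired response, peaked at shift 0
w = train_correlation_filter(x, y)
```

The per-frequency division replaces an n-by-n linear solve, which is what makes re-training once per frame cheap. Every step is an FFT or an element-wise operation, so the learner can in principle be differentiated through.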

    Learning feed-forward one-shot learners

    One-shot learning is usually tackled by using generative models or discriminative embeddings. Discriminative methods based on deep learning, which are very effective in other learning scenarios, are ill-suited for one-shot learning as they need large amounts of training data. In this paper, we propose a method to learn the parameters of a deep model in one shot. We construct the learner as a second deep network, called a learnet, which predicts the parameters of a pupil network from a single exemplar. In this manner we obtain an efficient feed-forward one-shot learner, trained end-to-end by minimizing a one-shot classification objective in a learning to learn formulation. In order to make the construction feasible, we propose a number of factorizations of the parameters of the pupil network. We demonstrate encouraging results by learning characters from single exemplars in Omniglot, and by tracking visual objects from a single initial exemplar in the Visual Object Tracking benchmark. Comment: The first three authors contributed equally, and are listed in alphabetical order

    Extracting Reward Functions from Diffusion Models

    Diffusion models have achieved remarkable results in image generation, and have similarly been used to learn high-performing policies in sequential decision-making tasks. Decision-making diffusion models can be trained on lower-quality data, and then be steered with a reward function to generate near-optimal trajectories. We consider the problem of extracting a reward function by comparing a decision-making diffusion model that models low-reward behavior and one that models high-reward behavior; a setting related to inverse reinforcement learning. We first define the notion of a relative reward function of two diffusion models and show conditions under which it exists and is unique. We then devise a practical learning algorithm for extracting it by aligning the gradients of a reward function -- parametrized by a neural network -- to the difference in outputs of both diffusion models. Our method finds correct reward functions in navigation environments, and we demonstrate that steering the base model with the learned reward functions results in significantly increased performance in standard locomotion benchmarks. Finally, we demonstrate that our approach generalizes beyond sequential decision-making by learning a reward-like function from two large-scale image generation diffusion models. The extracted reward function successfully assigns lower rewards to harmful images

    Invariant Information Clustering for Unsupervised Image Classification and Segmentation

    We present a novel clustering objective that learns a neural network classifier from scratch, given only unlabelled data samples. The model discovers clusters that accurately match semantic classes, achieving state-of-the-art results in eight unsupervised clustering benchmarks spanning image classification and segmentation. These include STL10, an unsupervised variant of ImageNet, and CIFAR10, where we significantly beat the accuracy of our closest competitors by 6.6 and 9.5 absolute percentage points respectively. The method is not specialised to computer vision and operates on any paired dataset samples; in our experiments we use random transforms to obtain a pair from each image. The trained network directly outputs semantic labels, rather than high dimensional representations that need external processing to be usable for semantic clustering. The objective is simply to maximise mutual information between the class assignments of each pair. It is easy to implement and rigorously grounded in information theory, meaning we effortlessly avoid degenerate solutions that other clustering methods are susceptible to. In addition to the fully unsupervised mode, we also test two semi-supervised settings. The first achieves 88.8% accuracy on STL10 classification, setting a new global state-of-the-art over all existing methods (whether supervised, semi-supervised or unsupervised). The second shows robustness to 90% reductions in label coverage, of relevance to applications that wish to make use of small amounts of labels. github.com/xu-ji/IIC Comment: International Conference on Computer Vision 201
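
The objective of maximising mutual information between the class assignments of each pair can be sketched in a few lines. This is an illustrative NumPy version under stated assumptions, not the code from the linked repository: the function name, the `eps` clipping, and the symmetrisation of the joint distribution are choices made for this sketch.

```python
import numpy as np

def pair_mutual_information(p, p_prime, eps=1e-12):
    """Mutual information between soft class assignments of paired samples.

    p, p_prime: arrays of shape (n, C); row i holds the class probabilities
    predicted for a sample and for its transformed partner.
    """
    n = p.shape[0]
    joint = p.T @ p_prime / n                 # C x C joint distribution over the batch
    joint = (joint + joint.T) / 2.0           # symmetrise (assumption of this sketch)
    joint = np.clip(joint, eps, None)         # avoid log(0)
    row = joint.sum(axis=1, keepdims=True)    # marginal of the first assignment
    col = joint.sum(axis=0, keepdims=True)    # marginal of the second assignment
    return float(np.sum(joint * (np.log(joint) - np.log(row) - np.log(col))))

# Confident, consistent, balanced assignments over 3 classes.
one_hot = np.eye(3)[np.array([0, 1, 2, 0, 1, 2])]
mi_good = pair_mutual_information(one_hot, one_hot)

# Degenerate case: every sample gets the same uniform prediction.
uniform = np.full((6, 3), 1.0 / 3.0)
mi_flat = pair_mutual_information(uniform, uniform)
```

The two cases show why the objective discourages degenerate solutions: balanced, consistent assignments reach the maximum of log C, while uninformative predictions score zero. A training loop would minimise the negative of this quantity.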

    Economic and Financial Analysis of the 3 Major Sports Companies (Sporting, Benfica and Porto)

    Football is a sport regarded as a worldwide phenomenon because it shapes cultures, draws crowds, and breaks down cultural and social barriers. In recent years, however, its economic importance has grown, and the Union of European Football Associations (UEFA) was therefore forced to implement changes to the sport in general, and Financial Fair Play (FFP) requirements in particular, focusing on economic and financial aspects. This justifies the objective of the applied project: an economic and financial analysis of the three Sociedades Anónimas Desportivas (SADs, public limited sports companies) that hold the most sporting titles in Portugal, namely Sporting Clube de Portugal – Futebol, SAD, Sport Lisboa e Benfica – Futebol SAD and Futebol Clube do Porto – Futebol, SAD. Methodologically, the first part of the applied project centred on a review of the literature on football and the SADs, together with an analysis of the laws and standards that govern these companies and of their financial reporting. The second part develops an empirical analysis that enabled an individual longitudinal assessment and a comparison of the three SADs over a study period of eight sporting seasons, from 2007/2008 to 2014/2015. The results led to the conclusion that each SAD presents and interprets its own financial reporting in a relevant way and in light of the sporting results achieved. Accordingly, based on the clubs' UEFA sporting ranking, an indicator was constructed to compare sporting and economic results and the way each can influence the other. In parallel, economic and financial analyses of the three companies were carried out, giving a view of each SAD's financial position and performance over the study period, in order to identify its financial balance and sustainability.

    Influence of Drying Treatment on Physical Properties of Pumpkin

    The aim of this work was to evaluate the properties of pumpkin (Cucurbita maxima L.) exposed to convective air drying and freeze-drying. The samples were analyzed in terms of physical properties (colour and texture). The trials in the convective chamber were done at 40 °C and 60 °C, in the drying tunnel at 60 °C and in the freeze dryer at -50 °C. It was concluded that freeze-drying and air drying at 40 °C produced smaller changes in colour, while drying in the tunnel caused more intense colour changes. With respect to texture, the pulp of the fresh product was found to be harder at 2 cm from the skin than at 4 cm from the skin. All drying treatments affected the texture of the pumpkin considerably when compared to the fresh product. Hardness varied by between 75 % (chamber drying at 40 °C) and 90 % (tunnel drying) relative to the fresh product. Springiness changed most in the drying at 40 °C, while cohesiveness showed the greatest change in the freeze-drying treatment.